(54) Device and method for face image extraction, and recording medium having recorded program for the method



(57) In a broadly-applicable face image extraction 
device and method for defining a face by position and 
size in target images varied in type for face image ex- 
traction at high speed, an edge extraction part (1) ex- 
tracts an edge part from a target image and generates 
an edge image. A template storage part (2) previously 
stores a template composed of a plurality of concentric 
shapes varied in size. A voting result storage part (3) 
has voting storage regions for each size of the concentric shapes of the template so as to store the result obtained by voting processing carried out by a voting part
(4). The voting part (4) carries out the voting processing 
utilizing the template at each pixel in the edge image, 
and stores the result obtained thereby in the corre- 
sponding voting storage region. After the voting 
processing, an analysis part (5) performs cluster evalu- 
ation based on the voting results stored in the voting 
storage regions, and then defines the face in the target 
image by position and size. 
Description

BACKGROUND OF THE INVENTION 
Field of the Invention 

[0001] The present invention relates to a device and 
a method for face image extraction, and a recording me- 
dium having recorded a program for carrying out the 
method. More specifically, in image processing, such 
device and method are used to extract, at high speed, 
a face region from a target image utilizing a template to 
define position and size thereof. 

Description of the Background Art 

[0002] A human face often reflects a person's thoughts and feelings, and is therefore a significant element of an image. In image processing, especially where human images are handled, a system that can automatically detect a human face and determine its position and size in a target image is useful. Here, the target image includes still pictures and moving pictures, and a person appearing therein may be either real or artificial, for example created by computer graphics. For this reason, attempts have recently been made in image processing to extract a face region from any target image with such a system.
[0003] Conventional technologies for such face image extraction have been disclosed in Japanese Patent Laid-Open Publication No. 9-73544 (97-73544) (hereinafter, first document) and No. 10-307923 (98-307923) (hereinafter, second document), for example.
[0004] The technology disclosed in the first document finds an approximation of a face region by an ellipse. Therein, the ellipse is defined by five parameters: center coordinates (x, y), a radius r, a ratio b between the major and minor axes, and an angle θ between the major axis and the x axis. These parameters are changed as appropriate to find the values optimal for face image extraction.

[0005] The technology of the second document successively finds face parts (e.g., eyes, nose, mouth).

[0006] In the first document, however, the approximation requires repeated calculation while changing those parameters (varying the angle θ is especially time-consuming). Considering that a face image hardly ever stays the same, real-time approximation is not feasible with the processing capability of existing personal computers, and neither is real-time face image extraction. This technology also gives no consideration to the possibility that one image may include several human faces, and thus its applicability is considered narrow.
[0007] The technology of the second document cannot be used unless the position of a face region in an image has been defined in advance. Therefore, it is applicable only to specific images, resulting in narrow applicability.

SUMMARY OF THE INVENTION 

[0008] Therefore, an object of the present invention is to provide a broadly-applicable device and method for defining a face by position and size in images varied in type for face image extraction at high speed, and a recording medium having recorded a program for carrying out the method.

[0009] The present invention has the following fea- 
tures to attain the object above. 
[0010] A first aspect of the present invention is directed to a face image extraction device for defining a face in a target image by position and size for extraction, comprising:

an edge extraction part for extracting an edge part (pixels outlining a person or face) from the target image, and generating an image having only the edge part (hereinafter, edge image);
a template storage part for storing a template composed of a plurality of predetermined concentric shapes equal in shape but varied in size;
a voting result storage part for storing, in an interrelating manner, voting values and coordinates of pixels on the edge image for every size of the concentric shapes of the template;
a voting part for increasing or decreasing the voting values of every pixel, specified by the coordinates, outlining each of the concentric shapes every time a center point of the template moves on the pixels in the edge part; and
an analysis part for defining the face in the target image by position and size based on the voting values stored in the voting result storage part.

[0011] As described above, in the first aspect, the face position can be detected at high speed with only light-loaded voting processing and evaluation of voting values. Further, since a template composed of concentric shapes varied in size is utilized, approximation can be done in a practical manner by comparing, in size, an edge part presumed to include a face region with the template. Accordingly, the size of the face can also be detected at high speed. As such, in the face image extraction device of the present invention, processing load can be considerably reduced, thereby achieving almost real-time face region extraction even with the processing capabilities available in existing personal computers. Further, in the first aspect, the location and number of face regions in a target image do not have to be defined prior to extraction, and thus a face can be detected no matter what size and type the target image is. Accordingly, applicability of the device is considered quite wide.
[0012] Herein, preferably, the predetermined concentric shape is a circle, an ellipse, or a polygon. In such case, the circle may improve the accuracy of the voting result because the distance from the center point to each pixel outlining the circle is constant.
[0013] Preferably, the edge extraction part extracts 
the edge part from the target image by using a filter for 
a high frequency component. 
[0014] Therefore, any high frequency component can be obtained by applying a filter to the target image, whereby position and size of a face can be preferably detected in a case where the target image is a still picture.
[0015] Preferably, when the target image is structured by a plurality of successive images, the edge extraction part extracts the edge part by comparing, for every image structuring the target image, a current image with the image temporally before it and with the image temporally after it to calculate the respective differences therebetween.
[0016] In this manner, a current target image is compared with the image temporally before it and then with the image temporally after it to calculate the respective differences therebetween. Accordingly, position and size of a face can be preferably detected in a case where the target image is a series of moving pictures. Further, with the help of a template for detection, a face region can be stably extracted at high speed even if facial expression changes to a greater extent at zoom-in or close-up, for example.
[0017] Also preferably, the edge extraction part detects, with respect to pixels extracted in every predetermined box, the pixel located at the far-left end or the far-right end in the box on a scanning line basis, and regards only the pixels detected thereby as the edge part.
[0018] In this manner, any part differing in texture within the contour is prevented from being extracted as the edge part. Therefore, the extraction processing can be done, at high speed, with respect to the face region.
[0019] Also preferably, the analysis part performs 
clustering with respect to the voting values stored in 
each of the voting result storage parts, and narrows 
down position and size of the face in the target image. 
[0020] Therefore, even in the case that a target image includes several faces, the face region can be extracted by clustering the voting results (each voting value) and then correctly evaluating correlation thereamong.
[0021] Also preferably, the face image extraction de- 
vice further comprises an image editing part for editing 
the target image in a predetermined manner by distin- 
guishing a face region defined by position and size in 
the analysis part from the rest in the target image. 
[0022] As such, by editing the target image while distinguishing a face region defined by position and size from the rest, only a desired part, i.e., a face, can be emphasized and thus become conspicuous in the target image. As an example, the target image excluding the face region may be solidly shaded, leading to eye-catching effects.

[0023] Still preferably, the face image extraction de- 
vice further comprises an image editing part for replac- 
ing an image of the face region defined by position and 
size by the analysis part with another. 
[0024] As such, the image of the face region can be replaced with another. In this manner, the face can be intentionally concealed. This is effective, for example, when image-monitoring a person who is suffering from dementia. In such case, by replacing the image of a face with another, privacy can be protected while a face area can still be defined for monitoring. This also works well when replacing images of a person's movement with those of another type of character.

[0025] A second aspect of the present invention is directed to a face image extraction method for defining a face in a target image by position and size for extraction, comprising:

an extraction step of extracting an edge part (pixels outlining a person or face) from the target image, and generating an image having only the edge part (hereinafter, edge image);
a first storage step of storing a template composed of a plurality of predetermined concentric shapes equal in shape but varied in size;
a second storage step of storing, in an interrelating manner, voting values and coordinates of pixels on the edge image for every size of the concentric shapes of the template;
a voting step of increasing or decreasing the voting values of every pixel, specified by the coordinates, outlining each of the concentric shapes every time a center point of the template moves on the pixels in the edge part; and
an analysis step of defining, after the voting step, the face in the target image by position and size based on the voting values.

[0026] As described above, in the second aspect, the face position can be detected at high speed with only light-loaded voting processing and evaluation of voting values. Further, since a template composed of concentric shapes varied in size is utilized, approximation can be done in a practical manner by comparing, in size, an edge part presumed to include a face region with the template. Accordingly, the size of the face can also be detected at high speed. As such, in the face image extraction device of the present invention, processing load can be considerably reduced, thereby achieving almost real-time face region extraction even with the processing capabilities available in existing personal computers. Further, in the second aspect, the location and number of face regions in a target image do not have to be defined prior to extraction, and thus a face can be detected no matter what size and type the target image is. Accordingly, applicability of the device is considered quite wide.

[0027] Herein, preferably, the predetermined concentric shape is a circle, an ellipse, or a polygon.
[0028] In such case, the circle may improve the accuracy of the voting result because the distance from the center point to each pixel outlining the circle is constant.
[0029] Also preferably, in the extraction step, the edge part is extracted from the target image by using a filter for a high frequency component.
[0030] Accordingly, a high frequency component is 
extracted from the target image by using a filter. There- 
fore, position and size of a face can be preferably de- 
tected in a case where the target image is a still picture. 
[0031] Also preferably, when the target image is structured by a plurality of successive images, the edge part is extracted by comparing, for every image structuring the target image, a current image with the image temporally before it and with the image temporally after it to calculate the respective differences therebetween.

[0032] In this manner, a current target image is compared with the image temporally before it and then with the image temporally after it to calculate the respective differences therebetween. Accordingly, position and size of a face can be preferably detected in a case where the target image is a series of moving pictures. Further, with the help of a template for detection, a face region can be stably extracted at high speed even if facial expression changes to a greater extent at zoom-in or close-up, for example.
[0033] Also preferably, in the extraction step, with respect to pixels extracted in every predetermined box, the pixel located at the far-left end or the far-right end in the box is detected on a scanning line basis, and only the pixels detected thereby are regarded as the edge part.
[0034] As such, any part differed in texture within con- 
tour is prevented from being extracted as the edge part. 
Therefore, the extraction processing can be done, at 
high speed, with respect to the face region. 
[0035] Still preferably, in the analysis step, clustering is performed with respect to the voting values stored in each of the voting result storage parts, and position and size of the face are narrowed down in the target image.
[0036] As such, even in the case that a target image 
includes several faces, the face region can be extracted 
by clustering the voting results (each voting value) and 
then correctly evaluating correlation thereamong. 
[0037] These and other objects, features, aspects
and advantages of the present invention will become 
more apparent from the following detailed description of 
the present invention when taken in conjunction with the 
accompanying drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0038] FIG. 1 is a block diagram showing the structure
of a face image extraction device according to one em- 
bodiment of the present invention; 
[0039] FIGS. 2a and 2b are diagrams each showing 
an exemplary structure of an edge extraction part 1;
[0040] FIGS. 3a to 3c are diagrams each showing an 
exemplary edge image extracted by the edge extraction 
part 1; 

[0041] FIGS. 4a to 4c are diagrams each showing an 
exemplary template stored in a template storage part 2; 
[0042] FIG. 5 is a flowchart showing the procedure of 
voting processing carried out by a voting part 4; 



[0043] FIG. 6 is a diagram in assistance of explaining 
the concept of voting values stored, through voting 
processing, in voting storage regions provided in a vot- 
ing result storage part 3; 

[0044] FIG. 7 is a flowchart showing the procedure of analysis processing carried out by an analysis part 5;
[0045] FIGS. 8a to 8c are diagrams in assistance of explaining the concept of clustering processing carried out in steps S23 and S24 in FIG. 7; and
[0046] FIGS. 9a to 9c are diagrams showing an exemplary image edit processing carried out by an image editing part 6.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0047] FIG. 1 is a block diagram showing the structure of a face image extraction device according to an embodiment of the present invention. In FIG. 1, the face image extraction device of the embodiment includes an edge extraction part 1, a template storage part 2, a voting result storage part 3, a voting part 4, an analysis part 5, and an image editing part 6.
[0048] Referring to the accompanying drawings, described below is the operation of each component above and a method for face image extraction.
[0049] The edge extraction part 1 receives an image for face image extraction (hereinafter, target image), and extracts an edge part therefrom to generate another image having only the edge part (hereinafter, edge image). Here, the edge part is a part (pixels) representing contours of a human body or face, for example, where the image is high in frequency. The target image may be either a still picture or moving pictures, and the technique applied for edge part extraction differs accordingly.

[0050] For a still picture, as shown in FIG. 2a, the edge extraction part 1 is implemented by a filter 11 which takes out only a high frequency component, thereby simplifying the edge part extraction process. The preferable type of the filter 11 is a Sobel filter.
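As an illustration only, the still-picture case might be sketched as follows. The function name `sobel_edge_image`, the list-of-lists image layout, the magnitude approximation |gx| + |gy|, and the threshold value are assumptions made for this sketch and are not taken from the embodiment.

```python
def sobel_edge_image(gray, threshold=128):
    """Return a binary edge image (1 = edge pixel) from a 2-D list of gray levels."""
    height, width = len(gray), len(gray[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):              # border pixels are left as 0
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    v = gray[y + dy - 1][x + dx - 1]
                    gx += kx[dy][dx] * v
                    gy += ky[dy][dx] * v
            # |gx| + |gy| is a cheap approximation of the gradient magnitude
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges


# A dark image with a bright right half: edge pixels appear along the boundary.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edge_image = sobel_edge_image(image)
```

In this toy input the high-frequency component lies only along the vertical boundary, so only the two columns adjacent to it are marked in the interior rows.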

[0051] For moving pictures, as shown in FIG. 2b, the edge extraction part 1 is implemented by a difference extraction part 12. Specifically, the difference extraction part 12 compares a targeted moving picture with the one located temporally before it and then with the one located temporally after it to calculate the respective differences therebetween (data difference on a pixel basis). Thereafter, any part found large in such detected difference (where motion in images is active) is extracted as an edge part.
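A minimal sketch of such difference extraction is given below. The description does not state how the two differences are combined, so this sketch assumes that a pixel is marked only when it differs strongly from both the preceding and the following picture; the function name and the threshold are likewise hypothetical.

```python
def difference_edge_image(prev, curr, nxt, threshold=30):
    """Mark pixels of `curr` that differ strongly from both neighbouring frames."""
    height, width = len(curr), len(curr[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d_before = abs(curr[y][x] - prev[y][x])   # difference to earlier frame
            d_after = abs(nxt[y][x] - curr[y][x])     # difference to later frame
            # Assumption: both differences must be large for an edge pixel.
            if d_before >= threshold and d_after >= threshold:
                edges[y][x] = 1
    return edges


# A bright spot moving one pixel to the right per frame.
prev_frame = [[0, 0, 0, 0]]
curr_frame = [[0, 200, 0, 0]]
next_frame = [[0, 0, 200, 0]]
motion_edges = difference_edge_image(prev_frame, curr_frame, next_frame)
```

Only the position occupied by the spot in the current frame changes against both neighbours, so only that pixel is extracted.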

[0052] Here, with the above techniques, any part differing in texture within the contours is also extracted as the edge part. FIG. 3a shows an exemplary edge image including such unwanted extracted parts. Although this causes no problem for the face image extraction device of the present invention, the following technique is preferable if faster processing is desired.
[0053] First, in such an edge image as shown in FIG. 3a, any area in which the edge part is rather concentrated is enclosed in a box (FIG. 3b). The image in the box is then subjected to bi-directional scanning on a scanning line basis (FIG. 3b), and the outline formed by the pixels each detected first thereby is determined as being the edge part in the target image (FIG. 3c). In this manner, any part differing in texture within the contour is prevented from being extracted. Any constituent for this processing may be provided subsequent to the filter 11 or the difference extraction part 12.
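The bi-directional scanning can be sketched as below, assuming the box has already been determined. The box format (inclusive corner coordinates) and the function name are hypothetical, and pixels outside the box are simply dropped in this sketch.

```python
def outline_in_box(edges, box):
    """Keep, per scanning line, only the first edge pixel met from the left and from the right."""
    x0, y0, x1, y1 = box                      # inclusive corners (assumed layout)
    out = [[0] * len(edges[0]) for _ in edges]
    for y in range(y0, y1 + 1):
        hits = [x for x in range(x0, x1 + 1) if edges[y][x]]
        if hits:
            out[y][hits[0]] = 1               # first hit when scanning left to right
            out[y][hits[-1]] = 1              # first hit when scanning right to left
    return out


# Interior pixels (texture inside the contour) are removed; the outline remains.
edges = [
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
outline = outline_in_box(edges, (0, 0, 4, 2))
```

Fewer edge pixels remain after this step, which reduces the number of positions at which the later voting processing has to be carried out.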

[0054] The template storage part 2 previously stores data about a template composed of a plurality of concentric shapes, which are equal in shape but varied in size. Here, although the concentric shape may be any shape such as a circle, an ellipse, a regular polygon, or a polygon, the circle is most preferable. This is because the distance from the center point to the outline of the shape, i.e., to each pixel outlining the circle, is always constant, thereby improving the accuracy of the later-described voting result.

[0055] Here, as shown in FIGS. 4a to 4c, a template described in this embodiment is presumed to be composed of concentric circles t1 to tn (where n is an arbitrary integer) each differing in radius from a center point P. As for those circles t1 to tn, the difference in radius may be uniform as in a template T1 of FIG. 4a or irregular as in a template T2 of FIG. 4b. Further, those circles t1 to tn may be outlined by a one-dot line (corresponding to one pixel in a target image) as in the template T2 of FIG. 4b, or, as in a template T3 of FIG. 4c, some or all of those may be outlined by a two-dot or thicker line (i.e., an annular ring shape). Hereinafter, the term "circle" means both a circle and an annular ring.

[0056] The circles t1 to tn are stored in the template storage part 2 as one template, but in practice each is handled independently. Therefore, for each of the circles t1 to tn, pixel data is stored in the template storage part 2 in the form of a table, for example.
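One possible table form is a list of pixel offsets relative to the center point P for each circle. The sketch below assumes one-pixel-wide circles and samples each outline at evenly spaced angles; `circle_offsets`, `build_template`, and the example radii are hypothetical names and values.

```python
import math


def circle_offsets(radius):
    """Pixel offsets (dx, dy) from the center point that outline one circle."""
    points = set()
    steps = max(8, int(8 * radius))           # enough samples to avoid gaps
    for k in range(steps):
        angle = 2 * math.pi * k / steps
        points.add((round(radius * math.cos(angle)),
                    round(radius * math.sin(angle))))
    return sorted(points)


def build_template(radii):
    """One offset table per concentric circle t1..tn."""
    return [circle_offsets(r) for r in radii]


# Three concentric circles with a uniform difference in radius.
template = build_template([2, 4, 6])
```

Storing offsets rather than absolute coordinates lets the same table be reused wherever the center point of the template is placed.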
[0057] The voting result storage part 3 has a region for each of the shapes of the template stored in the template storage part 2, in this example, the circles t1 to tn. The regions (hereinafter, referred to as voting storage regions) store the result obtained by voting processing carried out by the voting part 4, which will be described later. Herein, the number of voting storage regions provided in the voting result storage part 3 is equal to that of the circles, in this example, n. Note herein that each voting storage region is of the same size as a target image.
[0058] On the edge image generated by the edge extraction part 1, the voting part 4 carries out the voting processing utilizing the template stored in the template storage part 2. FIG. 5 is a flowchart showing the procedure of the voting processing.
[0059] Referring to FIG. 5, the voting part 4 first accesses the voting result storage part 3, and initializes, to 0, the components (voting values) representing x-y coordinates in each voting storage region (step S11). Thereafter, the voting part 4 sets the center point P of the template at the head of the pixels in the edge part on the edge image (step S12). To find the head of the pixels, the edge image is sequentially scanned, vertically or laterally, from the upper left. The position of the pixel found first in the edge part may be regarded as the head.
[0060] The voting part 4 then initializes, to "1", a counter i indicating which of the shapes of the template (in this example, circles t1 to tn) is used (step S13). When the counter i indicates 1, for example, the voting part 4 uses the circle t1 and specifies every component outlining the circle t1 on the edge image by x-y coordinates (step S14). The voting part 4 then adds, i.e., votes, "1" to each of the components specified by the x-y coordinates in the voting storage region provided for the circle t1 in the voting result storage part 3 (step S15). This is the voting processing.
[0061] Thereafter, the voting part 4 increments the counter i, so that i = 2 (step S17). Since the counter i is now indicating the circle t2, the voting part 4 then specifies every component outlining the circle t2 by x-y coordinates (step S14). The voting part 4 then adds "1" again to each of the components specified by the x-y coordinates, this time in the voting storage region provided for the circle t2 in the voting result storage part 3 (step S15).
[0062] As for the circles t3 to tn, the voting part 4 repeats the voting processing in steps S14 and S15 in the same manner as above while incrementing the counter i until i becomes n (steps S16, S17). As such, the voting storage regions provided for the respective circles t1 to tn store the voting result obtained through the voting processing carried out at the head pixel in the edge part.
[0063] Thereafter, the voting part 4 sets the center point P of the template at the pixel next to the head pixel, and then repeats the processing in steps S13 to S17. This is done for every pixel, one pixel at a time, in the edge part on the edge image (steps S18, S19). In short, the center point P of the template visits every pixel in the edge part for the voting processing.
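The flow of steps S11 to S19 might be sketched as follows. To keep the example short and easy to verify by hand, each "circle" of the hypothetical template is reduced to four offsets; a practical template would hold the full outline of each circle. All names are illustrative.

```python
def vote(edges, template):
    """Voting processing: one voting storage region per circle, same size as the image."""
    height, width = len(edges), len(edges[0])
    # Step S11: initialize every voting value to 0.
    regions = [[[0] * width for _ in range(height)] for _ in template]
    # Steps S12, S18, S19: place the center point P on each edge pixel in turn.
    for py in range(height):
        for px in range(width):
            if not edges[py][px]:
                continue
            # Steps S13 to S17: loop over the circles t1..tn.
            for i, offsets in enumerate(template):
                for dx, dy in offsets:            # step S14: outline coordinates
                    x, y = px + dx, py + dy
                    if 0 <= x < width and 0 <= y < height:
                        regions[i][y][x] += 1     # step S15: vote
    return regions


def coarse_circle(r):
    """Hypothetical four-point stand-in for a circle of radius r."""
    return [(r, 0), (-r, 0), (0, r), (0, -r)]


# Edge pixels lying on a circle of radius 3 around (5, 5) in an 11 x 11 image.
edges = [[0] * 11 for _ in range(11)]
for ex, ey in [(8, 5), (2, 5), (5, 8), (5, 2)]:
    edges[ey][ex] = 1

regions = vote(edges, [coarse_circle(2), coarse_circle(3)])
```

Here the votes concentrate at (5, 5) only in the region for the radius-3 circle, which matches the radius of the edge contour; the region for radius 2 shows no such concentration.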
[0064] As an example, by subjecting such an edge image as shown in FIG. 3c to the above-described voting processing, the n voting storage regions provided in the voting result storage part 3 store such voting values as shown in FIG. 6. For the sake of simplicity, FIG. 6 shows a case where the voting processing is carried out only at a specific pixel in the edge part. In each of the voting storage regions of FIG. 6, a circle is outlined by the components representing x-y coordinates having the voting value of "1". Here, since the voting value is accumulated as described in the foregoing, a part where the circles in FIG. 6 cross (indicated by a dot) has a larger voting value.

[0065] Accordingly, if the above-described voting processing is done to pixels being an edge part representing contours of a face approximated by a circle or an ellipse, the voting value is found larger in the vicinity of a center point thereof. It means that any part found larger in voting value is highly likely to be the center of the face. Such phenomenon of the voting value concentrating at a specific part becomes noticeable when the concentric shape is a circle having a radius equal to or almost equal to a minimum width of the edge part. In consideration thereof, by determining in which voting storage region such phenomenon is conspicuously observed, the face can be specified by size. This sounds similar to the generalized Hough transformation. However, the face image extraction method of the present invention is absolutely different therefrom in the respect that a face region can be specified, simultaneously, by position and size. This is implemented by using a template composed of concentric shapes varied in size.
[0066] Here, in the voting processing, the components representing x-y coordinates in each voting storage region may be initialized to a predetermined maximum value in step S11, and then the voting part 4 may subtract "1" from each applicable component in step S15. If this is the case, any part found smaller in voting value is highly likely to be the center of the face, and by determining in which voting storage region such phenomenon is conspicuously observed, the face can be specified by size.

[0067] Moreover, in step S15, the value added to or subtracted from the voting value is not restricted to "1", and may be arbitrarily determined.
[0068] Described next is a technique for specifying a 
face region in a target image according to the voting re- 
sults stored in the voting result storage part 3. 
[0069] Once the voting part 4 has completed its voting processing, the analysis part 5 refers to the voting results stored in the voting result storage part 3 for cluster evaluation, and then specifies a face in the target image by position and size. FIG. 7 is a flowchart showing the procedure of analysis processing carried out by the analysis part 5.

[0070] Referring to FIG. 7, the analysis part 5 first sets, to "1", a counter j whose value indicates which of the shapes of the template (in this example, circles t1 to tn) is referred to (step S21). When the counter j indicates 1, for example, the analysis part 5 refers to the voting storage region corresponding to the circle t1 for the voting result stored therein. The analysis part 5 then extracts any component whose voting value exceeds a predetermined threshold value G (e.g., 200) (step S22). This threshold value G can be arbitrarily determined based on the definition of the target image and the desired accuracy of face image extraction. The analysis part 5 performs clustering only for the extracted component(s) (step S23), and for each clustered region, calculates variance and covariance values (step S24). In order to determine similarity among clustered regions, any of Euclidean squared distance, generalized Euclidean squared distance, Mahalanobis distance, or Minkowski distance may be applied. Further, to form clusters, any of SLINK (single linkage clustering method), CLINK (complete linkage clustering method), or UPGMA (unweighted pair-group method using arithmetic averages) may be applied.
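Steps S22 to S24 might be sketched as follows, choosing single linkage with Euclidean squared distance from among the listed options. The merge limit `max_sq_dist`, the function names, and the toy voting storage region are assumptions made for this sketch.

```python
def extract_candidates(region, g):
    """Step S22: coordinates whose voting value exceeds the threshold G."""
    return [(x, y) for y, row in enumerate(region)
            for x, v in enumerate(row) if v > g]


def single_linkage(points, max_sq_dist):
    """Step S23: merge clusters whose closest members lie within max_sq_dist."""
    clusters = [[p] for p in points]
    merged = True
    while merged:
        merged = False
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                        for p in clusters[a] for q in clusters[b])
                if d <= max_sq_dist:
                    clusters[a].extend(clusters.pop(b))
                    merged = True
                    break
            if merged:
                break
    return clusters


def cluster_stats(cluster):
    """Step S24: center point, variances, and covariance of one clustered region."""
    n = len(cluster)
    mx = sum(x for x, _ in cluster) / n
    my = sum(y for _, y in cluster) / n
    var_x = sum((x - mx) ** 2 for x, _ in cluster) / n
    var_y = sum((y - my) ** 2 for _, y in cluster) / n
    cov_xy = sum((x - mx) * (y - my) for x, y in cluster) / n
    return (mx, my), var_x, var_y, cov_xy


region = [
    [0,   0,   0, 0, 0, 0,   0, 0],
    [0, 250, 260, 0, 0, 0,   0, 0],
    [0, 240,   0, 0, 0, 0, 300, 0],
    [0,   0,   0, 0, 0, 0,   0, 0],
]
candidates = extract_candidates(region, g=200)
clusters = single_linkage(candidates, max_sq_dist=2)
stats = [cluster_stats(c) for c in clusters]
```

With these toy values, the three adjacent components form one cluster and the isolated component forms another.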
[0071] The analysis part 5 then compares the variance and covariance values calculated for each clustered region with a predetermined threshold value H (step S25). If those values are found smaller than the threshold value H in step S25, the analysis part 5 regards the center point of the region as a center point of the face. Assuming that the counter j indicates "1", the size (diameter) of the circle t1 is determined as being the length of the minor axis (step S26), and a length obtained by adding a constant (empirically determined) to the minor axis is determined as the major axis of the face (step S27). The analysis part 5 stores the thus determined center point and minor and major axes as the analysis results (step S28). On the other hand, if the variance and covariance values are found equal to or larger than the threshold value H, the analysis part 5 determines that the center point of the region is not a center point of the face, and then the procedure moves to the next processing.
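The decision in steps S25 to S28 can be sketched as below. How the variance and covariance values are jointly compared with H is not detailed in the description; this sketch assumes that all of them must be smaller than H. The parameter `axis_margin` stands for the empirically determined constant, and all names are hypothetical.

```python
def evaluate_cluster(center, var_x, var_y, cov_xy, radius, h, axis_margin):
    """Return the analysis result for one clustered region, or None if it is rejected."""
    # Step S25 (assumption): every value must be smaller than the threshold H.
    if max(var_x, var_y, abs(cov_xy)) >= h:
        return None
    minor_axis = 2 * radius                    # step S26: diameter of the circle
    major_axis = minor_axis + axis_margin      # step S27: add an empirical constant
    # Step S28: the values to be stored as the analysis result.
    return {"center": center, "minor_axis": minor_axis, "major_axis": major_axis}


compact = evaluate_cluster((5.0, 5.0), 0.2, 0.3, 0.0, radius=3, h=1.0, axis_margin=2)
spread = evaluate_cluster((9.0, 4.0), 4.0, 0.3, 0.1, radius=3, h=1.0, axis_margin=2)
```

A tightly concentrated cluster is accepted as a face center, while a widely spread one is rejected.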
[0072] Thereafter, the analysis part 5 increments the counter j, so that j = 2 (step S30). Since the counter j is now indicating the circle t2, the analysis part 5 then refers to the voting result stored in the voting storage region corresponding to the circle t2, and then extracts any component whose voting value exceeds the threshold value G (step S22). The analysis part 5 performs clustering only for the extracted component(s) (step S23), and for each clustered region, calculates variance and covariance values (step S24).
[0073] The analysis part 5 then compares the variance and covariance values calculated for each clustered region with the predetermined threshold value H (step S25). If those values are found smaller than the threshold value H in step S25, the analysis part 5 regards the center point of the region as a center point of the face. Assuming that the counter j indicates "2", the size of the circle t2 is determined as being the length of the minor axis (step S26), and a length obtained by adding a constant (empirically determined) to the minor axis is determined as the major axis of the face (step S27). The analysis part 5 additionally stores the thus determined center point and minor and major axes as the analysis results (step S28). On the other hand, if the variance and covariance values are found equal to or larger than the threshold value H, the analysis part 5 determines that the center point of the region is not a center point of the face, and then the procedure moves to the next processing.
[0074] As for the circles t3 to tn, the analysis part 5 repeats the analysis processing in steps S22 to S28 in the same manner as above while incrementing the counter j until j becomes n (steps S29, S30). As such, the analysis results obtained through the analysis processing carried out for face image extraction on the voting storage regions provided for the respective circles t1 to tn are stored.

[0075] The analysis results are then outputted to the image editing part 6.

[0076] Here, with reference to FIGS. 8a to 8c, the 
clustering carried out in steps S23 and S24 is briefly de- 
scribed. 
[0077] Assume herein a case where components exceeding
the threshold value G in voting value (dots in the
drawings) are distributed as in FIG. 8a. Cluster
evaluation performed in such a case by the analysis
part 5 is as follows. In the initial clustering,
exemplarily as in FIG. 8b, four initial clusters A, B, C,
and D are generated. Once those initial clusters are
generated, similarity is calculated for every pair of
clusters. If the calculated similarity is equal to or
larger than a predetermined threshold value, the
applicable pair of clusters is combined. In FIG. 8c,
exemplarily, the clusters C and D are combined and
become a cluster E. Thereafter, a variance value and the
like are calculated for the clusters A, B, and E for
evaluation. Herein, since the clusters A and B are both
small in variance value, center points thereof are both
considered a center of the face. The cluster E, large in
variance value, is not considered a center of the face.
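
For illustration only, the combining of clusters described above
may be sketched in Python as follows. The embodiment does not
specify the similarity measure; the measure used here (a value
that grows as the centroids of two clusters approach each other),
the function names, and the sample coordinates are assumptions
made for this sketch.

```python
from math import dist


def centroid(cluster):
    """Mean position of the (x, y) points in a cluster."""
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)


def similarity(a, b):
    """Assumed similarity: 1 for coincident centroids, falling toward 0."""
    return 1.0 / (1.0 + dist(centroid(a), centroid(b)))


def merge_clusters(clusters, threshold):
    """Combine pairs whose similarity is equal to or larger than threshold."""
    clusters = [list(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if similarity(clusters[i], clusters[j]) >= threshold:
                    # Combine the pair and restart the pairwise scan.
                    clusters[i] = clusters[i] + clusters[j]
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


cluster_a = [(0, 0), (1, 0)]
cluster_b = [(50, 0), (51, 0)]
cluster_c = [(100, 100), (102, 100)]
cluster_d = [(104, 100), (106, 100)]
merged_clusters = merge_clusters(
    [cluster_a, cluster_b, cluster_c, cluster_d], threshold=0.1
)
```

With these sample values only the clusters C and D are close enough
to be combined, so three clusters remain, corresponding to the
clusters A, B, and E of FIG. 8c.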

[0078] In the case that two or more clusters are
detected by evaluation made based on the variance value,
determination of a face region may be done, for example,
as follows:

First, if the detected clusters share a center point
and are varied in size, a face region is the cluster
whose variance value is minimum;
Second, if the detected clusters do not share a center
point and are varied in size, all of those are face
regions, each different in position and size; and
Third, if the detected clusters do not share a center
point but are identical in size, all of those are face
regions different in position but the same in size.
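
For illustration only, the three rules above may be sketched in
Python as follows. The dictionary keys, the function name, and the
optional tolerance used to decide whether two clusters share a
center point are assumptions made for this sketch.

```python
def select_face_regions(candidates, center_tolerance=0.0):
    """Apply the three rules to clusters that passed the variance check.

    candidates : list of dicts with keys "center", "size", "variance"
    Clusters sharing a center point are reduced to the one whose
    variance is minimum; clusters with distinct centers are all kept.
    """
    groups = []
    for cand in candidates:
        cx, cy = cand["center"]
        for group in groups:
            gx, gy = group[0]["center"]
            if abs(gx - cx) <= center_tolerance and abs(gy - cy) <= center_tolerance:
                group.append(cand)  # shares a center point with this group
                break
        else:
            groups.append([cand])   # a new, distinct center point
    return [min(group, key=lambda c: c["variance"]) for group in groups]


detected = [
    {"center": (10, 10), "size": 20, "variance": 0.5},
    {"center": (10, 10), "size": 30, "variance": 0.2},
    {"center": (80, 40), "size": 20, "variance": 0.4},
]
faces = select_face_regions(detected)
```

Here the two clusters centered at (10, 10) are reduced to the one
with the smaller variance, while the cluster at (80, 40) is kept as
a separate face region.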

[0079] The image editing part 6 receives the analysis
results (face regions) from the analysis part 5, and
responds to any request for image processing with respect
to the target image. Utilized herein is a face region
being distinguishable from the rest by the analysis
results. For example, the image editing part 6 clips or
solidly shades the target image of FIG. 9a, leaving only
a face region (FIG. 9b). Accordingly, generated thereby
is an image having only a face emphasized. Alternatively,
the image of the face region of FIG. 9a can be replaced
with another (e.g., image of another character's face) as
shown in FIG. 9c. In this manner, the face can be
intentionally concealed.
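
For illustration only, the solid shading of everything other than
the face region may be sketched in Python as follows. It is assumed
here that the face region is an upright ellipse whose horizontal
extent is the minor axis and whose vertical extent is the major
axis, that both axes are given as full lengths, and that the image
is a list of rows of pixel values; none of these details is fixed
by the embodiment.

```python
def shade_outside_face(image, center, minor_axis, major_axis, fill=0):
    """Return a copy of image with pixels outside the face ellipse set to fill."""
    cx, cy = center
    a = minor_axis / 2.0  # assumed horizontal semi-axis
    b = major_axis / 2.0  # assumed vertical semi-axis
    result = []
    for y, row in enumerate(image):
        new_row = []
        for x, value in enumerate(row):
            inside = ((x - cx) / a) ** 2 + ((y - cy) / b) ** 2 <= 1.0
            new_row.append(value if inside else fill)
        result.append(new_row)
    return result


target = [[9] * 5 for _ in range(5)]
shaded = shade_outside_face(target, center=(2, 2), minor_axis=2, major_axis=4)
```

Replacing the face region with another image, as in FIG. 9c, would
follow the same pattern with the condition inverted and the
replacement pixel taken from the other image.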

[0080] Note that, the image editing part 6 is appropri- 
ately provided to meet a need for image processing uti- 
lizing the extracted face region, but is not essential for 
the face image extraction device of the present inven- 
tion. 

[0081] As is known from the above, according to the
face image extraction device and method of the present
embodiment, face position can be detected at high speed
with only light-loaded voting processing (basically,
only addition) and evaluation of voting values. Further,
since a template composed of concentric shapes varied in
size is utilized, approximation can be done in a
practical manner by comparing, in size, an edge part
presumed to be a face region with the template.
Accordingly, the size of the face can also be detected at
high speed. As such, in the face image extraction device
of the present invention, processing load can be
considerably reduced, thereby achieving almost real-time
face region extraction even with the processing
capabilities available in existing personal computers.
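
For illustration only, the addition-only voting processing with a
template of concentric circles may be sketched in Python as
follows. The way the circle outline is rasterized, the function
names, and the sample image size and radii are assumptions made
for this sketch.

```python
def circle_offsets(radius):
    """Integer offsets (dx, dy) lying within half a pixel of the circle outline."""
    inner = (2 * radius - 1) ** 2  # equals 4 * (radius - 0.5) ** 2
    outer = (2 * radius + 1) ** 2  # equals 4 * (radius + 0.5) ** 2
    span = range(-radius - 1, radius + 2)
    return [(dx, dy) for dy in span for dx in span
            if inner <= 4 * (dx * dx + dy * dy) < outer]


def vote(edge_pixels, width, height, radii):
    """Move the template center over every edge pixel and add one vote
    to each pixel outlining each circle, one voting region per radius."""
    template = {r: circle_offsets(r) for r in radii}
    votes = {r: [[0] * width for _ in range(height)] for r in radii}
    for ex, ey in edge_pixels:
        for r, offsets in template.items():
            region = votes[r]
            for dx, dy in offsets:
                x, y = ex + dx, ey + dy
                if 0 <= x < width and 0 <= y < height:
                    region[y][x] += 1  # the only arithmetic is addition
    return votes


# Synthetic edge image: the outline of a circle of radius 5 at (10, 10).
edge = [(10 + dx, 10 + dy) for dx, dy in circle_offsets(5)]
votes = vote(edge, width=21, height=21, radii=[3, 5])
```

In this sketch every edge pixel votes for the true center in the
voting region of radius 5, while the voting region of radius 3
receives no vote at that position, so both position and size can be
read from the voting values.
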
[0082] Further, with the face image extraction device
of the present invention, the location and number of face
regions in a target image do not have to be defined prior
to extraction, and thus a face can be detected no matter
what size and type the target image is. Accordingly,
applicability of the device is considered quite wide.
Moreover, even in the case that the target image includes
several faces, the face regions can be extracted by
clustering the voting results and then correctly
evaluating correlation thereamong.

[0083] Typically, the face image extraction device of
the above embodiment is functionally (face image
extraction method) implemented by a storage having a
predetermined program stored therein (e.g., ROM, RAM,
hard disk) and a CPU (Central Processing Unit) carrying
out the program. Here, the program may be provided by a
recording medium such as a CD-ROM or floppy disk. The
program may be partially recorded in a plurality of
recording media for distribution.
[0084] It is herein assumed that a part of the program
is functionally put on various processes or threads
(e.g., DLL) no matter whether the program is a part of
an operating system or not. In such a case, even if not
storing that part of the program, the recording medium
is regarded as the one having recorded the program for
carrying out the face image extraction method of the
present invention.

[0085] Moreover, described in the foregoing is the
exemplary case that the face image extraction method of
the present invention is implemented by a stand-alone
type (FIG. 1), but this is not restrictive and the method
may be implemented by a server-client type. In other
words, in addition to the stand-alone type having only
one terminal functionally carry out the face image
extraction method, the server-client type will also do.
Therein, the face image extraction method is partially or
entirely carried out functionally by a server or a device
on a network connectable to a terminal being a client.
For example, the server may be the one functionally
carrying out the method, and the client has only a WWW
browser. In such a case, information (e.g., template,
voting values) is normally on the server, and is
distributed to the client basically over the network.
When the information is on the server, a storage in the
server is equivalent to the "recording medium", and when
on the client, a storage in the client is equivalent
thereto.

[0086] Further, the program carrying out the face image
extraction method of the present invention may be an
application written in machine language after
compilation, or an intermediate code interpreted by the
above process or thread. Or, in a "recording medium", at
least resource and source codes are stored together with
a compiler and a linker, which can generate an
application written in machine language by utilizing
such codes. Or, in a "recording medium", at least the
resource and source codes are stored together with an
interpreter, which can generate an application in the
intermediate code by utilizing such codes.

[0087] While the invention has been described in de- 
tail, the foregoing description is in all aspects illustrative 
and not restrictive. It is understood that numerous other 
modifications and variations can be devised without de- 
parting from the scope of the invention. 



Claims 

1. A face image extraction device for defining a face
in a target image by position and size for extraction,
comprising:

an edge extraction part (1) for extracting an edge part
(pixels outlining a person or face) from said target
image, and generating an image having only the edge part
(hereinafter, edge image);

a template storage part (2) for storing a template
composed of a plurality of predetermined concentric
shapes equal in shape but varied in size;

a voting result storage part (3) for storing, in an
interrelating manner, voting values and coordinates of
pixels on said edge image for every size of the
concentric shapes of said template;

a voting part (4) for increasing or decreasing said
voting values of every pixel, specified by the
coordinates, outlining each of said concentric shapes
every time a center point of said template moves on the
pixels in said edge part; and

an analysis part (5) for defining the face in said
target image by position and size based on said voting
values stored in said voting result storage part (3).

2. The face image extraction device according to claim
1, wherein said predetermined concentric shape is
a circle.

3. The face image extraction device according to claim
1, wherein said predetermined concentric shape is
an ellipse.

4. The face image extraction device according to claim
1, wherein said predetermined concentric shape is
a polygon.

5. The face image extraction device according to claim
1, wherein said edge extraction part (1) extracts
said edge part from said target image by using a
filter for a high frequency component.

6. The face image extraction device according to claim
1, wherein, when said target image is structured by
a plurality of successive images, said edge extraction
part (1) extracts said edge part by comparing a
current image with another image temporally before,
and with another temporally after, to calculate a
difference therebetween, respectively, for every image
structuring said target image.

7. The face image extraction device according to claim
1, wherein said edge extraction part (1) detects,
with respect to pixels extracted in every predetermined
box, one pixel located at the far-left end or far-right
end in the box on a scanning line basis, and regards
only the pixels detected thereby as said edge part.

8. The face image extraction device according to claim
1, wherein said analysis part (5) performs clustering
with respect to said voting values stored in each of
said voting result storage parts (3), and narrows
down position and size of the face in said target
image.

9. The face image extraction device according to claim
1, further comprising an image editing part (6) for
editing said target image in a predetermined manner
by distinguishing a face region defined by position
and size in said analysis part (5) from the rest
in the target image.

10. The face image extraction device according to claim
1, further comprising an image editing part (6) for
replacing an image of the face region defined by
position and size by said analysis part (5) with another.

11. A face image extraction method for defining a face
in a target image by position and size for extraction,
comprising:

an extraction step of extracting an edge part (pixels
outlining a person or face) from said target image, and
generating an image having only the edge part
(hereinafter, edge image);

a first storage step of storing a template composed of
a plurality of predetermined concentric shapes equal in
shape but varied in size;

a second storage step of storing, in an interrelating
manner, voting values and coordinates of pixels on said
edge image for every size of the concentric shapes of
said template;

a voting step of increasing or decreasing said voting
values of every pixel, specified by the coordinates,
outlining each of said concentric shapes every time a
center point of said template moves on the pixels in
said edge part; and

an analysis step of defining, after said voting step,
the face in said target image by position and size based
on said voting values.

12. The face image extraction method according to
claim 11, wherein said predetermined concentric
shape is a circle.

13. The face image extraction method according to
claim 11, wherein said predetermined concentric
shape is an ellipse.

14. The face image extraction method according to
claim 11, wherein said predetermined concentric
shape is a polygon.

15. The face image extraction device according to claim
11, wherein, in said extraction step, said edge part
is extracted from said target image by using a filter
for a high frequency component.

16. The face image extraction method according to
claim 11, wherein, in said extraction step, when said
target image is structured by a plurality of successive
images, said edge part is extracted by comparing
a current image with another image temporally
before, and with another temporally after, to calculate
a difference therebetween, respectively, for every image
structuring said target image.

17. The face image extraction method according to
claim 11, wherein, in said extraction step, with respect
to pixels extracted in every predetermined box, one
pixel located at the far-left end or far-right end in
the box is detected on a scanning line basis, and
only the pixels detected thereby are regarded as said
edge part.

18. The face image extraction method according to
claim 11, wherein, in said analysis step, clustering
is performed with respect to said voting values
stored in each of said voting result storage parts,
and position and size of the face are narrowed down
in said target image.

19. A recording medium having recorded a face image
extraction method for defining a face in a target image
by position and size as a program executable
on a computer device, the program at least comprising:

an extraction step of extracting an edge part (pixels
outlining a person or face) from said target image, and
generating an image having only the edge part
(hereinafter, edge image);

a first storage step of storing a template composed of
a plurality of predetermined concentric shapes equal in
shape but varied in size;

a second storage part of storing, in an interrelating
manner, voting values and coordinates of pixels on said
edge image for every size of the concentric shapes of
said template;

a voting step of increasing or decreasing said voting
values of every pixel outlining each of said concentric
shapes every time a center point of said template moves
on the pixel in said edge part; and

an analysis step of defining, after said voting step,
the face in said target image by position and size based
on said voting values.